A Declarative Characterization of Different Types of Multicomponent Tree Adjoining Grammars
Abstract
Multicomponent Tree Adjoining Grammar (MCTAG) is a formalism that has been shown to be useful for many natural language applications. The definition of MCTAG, however, is problematic since it refers to the process of the derivation itself: a simultaneity constraint must be respected concerning the way the members of the elementary tree sets are added. This way of characterizing MCTAG does not allow one to abstract away from the concrete order of derivation. In this paper, we propose an alternative definition of MCTAG that characterizes the trees in the tree language of an MCTAG via the properties of the derivation trees (in the underlying TAG) that the MCTAG licenses. This definition gives a better understanding of the formalism, allows a more systematic comparison of different types of MCTAG, and, furthermore, can be exploited for parsing.

1 TAG and MCTAG

Tree Adjoining Grammar (TAG, [Joshi and Schabes, 1997]) is a tree-rewriting formalism. A TAG consists of a finite set of trees (elementary trees). The nodes of these trees are labelled with nonterminals and terminals (terminals only label leaf nodes). Starting from the elementary trees, larger trees are derived by substitution (replacing a leaf with a new tree) and adjunction (replacing an internal node with a new tree).

In the case of an adjunction, the tree being adjoined has exactly one leaf that is marked as the foot node (marked with an asterisk). Such a tree is called an auxiliary tree. When it is adjoined to a node n, the subtree rooted at n in the old tree is attached to the foot node of the auxiliary tree in the resulting tree. Non-auxiliary elementary trees are called initial trees. A derivation starts with an initial tree. In a final derived tree, all leaves must have terminal labels. For a sample derivation see Fig. 1.
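The two operations can be sketched in a few lines of Python. This is only an illustrative sketch, not an implementation from the paper: the names `Node`, `substitute`, `adjoin` and `words` are hypothetical, nodes are addressed by tuples of 1-based child indices, and terminals are not distinguished from nonterminals beyond being leaves.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    foot: bool = False  # True only for the foot node of an auxiliary tree

def get(tree, address):
    """Return the node at an address given as a tuple of 1-based child indices."""
    for j in address:
        tree = tree.children[j - 1]
    return tree

def replace(tree, address, new):
    """Return a copy of tree in which the node at address is replaced by new."""
    if not address:
        return new
    copy = Node(tree.label, list(tree.children), tree.foot)
    j = address[0]
    copy.children[j - 1] = replace(tree.children[j - 1], address[1:], new)
    return copy

def find_foot(tree, prefix=()):
    """Return the address of the foot node, or None if there is none."""
    if tree.foot:
        return prefix
    for j, child in enumerate(tree.children, 1):
        found = find_foot(child, prefix + (j,))
        if found is not None:
            return found
    return None

def substitute(tree, address, initial):
    """Replace a leaf with an initial tree whose root carries the same label."""
    target = get(tree, address)
    assert not target.children and target.label == initial.label
    return replace(tree, address, initial)

def adjoin(tree, address, auxiliary):
    """Replace a node with an auxiliary tree; the old subtree goes to the foot."""
    target = get(tree, address)
    foot = find_foot(auxiliary)
    assert foot is not None and target.label == auxiliary.label
    old_subtree = Node(target.label, target.children)
    return replace(tree, address, replace(auxiliary, foot, old_subtree))

def words(tree):
    """Return the leaf labels from left to right."""
    if not tree.children:
        return [tree.label]
    return [w for child in tree.children for w in words(child)]

# Elementary trees of the sample derivation "John always laughs".
john = Node("NP", [Node("John")])
laughs = Node("S", [Node("NP"), Node("VP", [Node("V", [Node("laughs")])])])
always = Node("VP", [Node("ADV", [Node("always")]), Node("VP", foot=True)])

derived = adjoin(substitute(laughs, (1,), john), (2,), always)
print(" ".join(words(derived)))
```

The functions build new trees instead of modifying their arguments, so the elementary trees can be reused in further derivation steps.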
Definition 1 (Tree Adjoining Grammar) A Tree Adjoining Grammar (TAG) is a tuple G = 〈I, A, N, T〉 with
– N and T being disjoint finite sets, the nonterminals and terminals,
– I being a finite set of initial trees with nonterminals N and terminals T, and
– A being a finite set of auxiliary trees with nonterminals N and terminals T.

[Fig. 1. TAG derivation for John always laughs. Elementary trees: NP(John); S(NP, VP(V(laughs))); VP(ADV(always), VP∗). Derived tree: S(NP(John), VP(ADV(always), VP(V(laughs)))). Derivation tree: root laugh with daughters john (edge 1) and always (edge 2).]

Definition 2 (TAG derivation and tree language) Let G = 〈I, A, N, T〉 be a TAG. Let γ and γ′ be finite trees.
– γ ⇒ γ′ in G iff there is a node position p and a tree γ″ that is either elementary or derived from some elementary tree such that γ′ = γ[p, γ″].¹ ⇒* is the reflexive transitive closure of ⇒.
– The tree language of G is L_T(G) := {γ | there is an α ∈ I such that α ⇒* γ and all leaves in γ have terminal labels}.

TAG derivations are represented by derivation trees that record the history of how the elementary trees are put together. A derived tree is the result of carrying out the substitutions and adjunctions, i.e., the derivation tree uniquely describes the derived tree. Each edge in a derivation tree stands for an adjunction or a substitution. The edges are labelled with Gorn addresses.² For example, the derivation tree in Fig. 1 indicates that the elementary tree for John is substituted for the node at address 1 and that always is adjoined at the node with address 2.

Definition 3 (TAG derivation tree) Let G = 〈I, A, N, T〉 be a TAG. Let γ be a tree derived as follows in G: γ = γ_0[p_1, γ_1] … [p_k, γ_k], where γ_0 is an instance of an elementary tree and the substitutions/adjunctions of γ_1, …, γ_k are all the substitutions/adjunctions to γ_0 that are performed to derive γ. Then the corresponding derivation tree has a root labelled with γ_0 that has k daughters. The edges from γ_0 to these daughters are labelled with p_1, …, p_k, and the daughters are the derivation trees for the derivations of γ_1, …, γ_k.

A TAG extension that is useful for linguistic applications is multicomponent TAG (MCTAG, [Weir, 1988]). An MCTAG contains sets of elementary trees. In each derivation step, one of the tree sets is chosen and its trees are added simultaneously. Depending on the nodes to which the trees from the set attach, different kinds of MCTAG are distinguished: if the nodes are required to be part of the same elementary tree, the MCTAG is tree-local; if they are required to be part of the same tree set, the grammar is set-local; otherwise it is non-local.

¹ For trees γ, γ_1, …, γ_n and pairwise different node positions p_1, …, p_n in γ, γ[p_1, γ_1] … [p_n, γ_n] denotes the result of successively substituting/adjoining γ_1, …, γ_n at the nodes in γ with addresses p_1, …, p_n, respectively.
² The root address is ε, and the jth child of a node with address p has address pj.
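To make the three locality notions concrete, the following Python sketch inspects a finished derivation tree and reports where the members of each tree set attached. It is a simplified illustration of the informal description above, not the declarative definition proposed in the paper. The names `DNode`, `set_id`, `attachments` and `locality` are hypothetical, Gorn addresses are written as tuples of 1-based child indices (the empty tuple is the root address), and the check neither verifies that every member of a set is used nor classifies a grammar as a whole.

```python
from dataclasses import dataclass, field

@dataclass
class DNode:
    tree: str      # name of the elementary tree instance at this node
    set_id: str    # hypothetical id of the tree-set instance it belongs to
    children: dict = field(default_factory=dict)  # Gorn address -> DNode

def attachments(root):
    """Yield (parent, address, child) for every edge of the derivation tree."""
    stack = [root]
    while stack:
        parent = stack.pop()
        for address, child in parent.children.items():
            yield parent, address, child
            stack.append(child)

def locality(root):
    """Return the least restrictive attachment pattern found in the derivation."""
    hosts = {}  # set_id -> derivation nodes that members of the set attach to
    for parent, _, child in attachments(root):
        hosts.setdefault(child.set_id, []).append(parent)
    kinds = set()
    for parents in hosts.values():
        if all(p is parents[0] for p in parents):
            kinds.add("tree-local")   # all members attach to the same tree
        elif len({p.set_id for p in parents}) == 1:
            kinds.add("set-local")    # hosts differ but share one tree set
        else:
            kinds.add("non-local")
    for kind in ("non-local", "set-local", "tree-local"):
        if kind in kinds:
            return kind
    return "tree-local"  # a derivation without edges violates nothing

# Derivation tree of Fig. 1: each elementary tree is its own singleton set.
fig1 = DNode("laugh", "s0", {(1,): DNode("john", "s1"),
                             (2,): DNode("always", "s2")})

# Set B = {b1, b2} attaches to two different trees of the same set A.
set_local = DNode("r", "R", {
    (1,): DNode("a1", "A", {(1,): DNode("b1", "B")}),
    (2,): DNode("a2", "A", {(1,): DNode("b2", "B")}),
})

# Set C = {c1, c2} attaches to trees that belong to different sets.
non_local = DNode("r", "R", {
    (1,): DNode("a1", "A", {(1,): DNode("c2", "C")}),
    (2,): DNode("c1", "C"),
})

for example in (fig1, set_local, non_local):
    print(example.tree, locality(example))
```

Because the check looks only at the shape of the derivation tree, it does not depend on the order in which the derivation steps were carried out.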